probabilistic linear solver
Probabilistic Linear Solvers for Machine Learning
Linear systems are the bedrock of virtually all numerical computation. Machine learning poses specific challenges for the solution of such systems due to their scale, characteristic structure, stochasticity and the central role of uncertainty in the field. Unifying earlier work, we propose a class of probabilistic linear solvers which jointly infer the matrix, its inverse and the solution from matrix-vector product observations. This class emerges from a fundamental set of desiderata which constrains the space of possible algorithms and recovers the method of conjugate gradients under certain conditions. We demonstrate how to incorporate prior spectral information in order to calibrate uncertainty and experimentally showcase the potential of such solvers for machine learning.
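For illustration, the idea of inferring the solution from matrix-vector product observations can be sketched with plain Gaussian conditioning. The snippet below is not the solver class proposed in the abstract, which jointly infers the matrix, its inverse and the solution. It is a simpler, solution-only construction that assumes a Gaussian prior N(x0, Sigma0) over x and noise-free observations S^T A x = S^T b. The function name, the random directions S and the test matrix are hypothetical choices made for this example.

```python
import numpy as np

def gaussian_solution_posterior(A, b, S, x0, Sigma0):
    """Condition the prior N(x0, Sigma0) over x on the projections S^T A x = S^T b.

    Illustrative sketch only (hypothetical helper, not the paper's algorithm).
    """
    AS = A.T @ S                  # A^T applied to each column of S (one matvec per direction)
    K = Sigma0 @ AS               # cross-covariance between x and the observations
    G = AS.T @ K                  # Gram matrix S^T A Sigma0 A^T S
    r = S.T @ (b - A @ x0)        # projected residual of the prior mean
    mean = x0 + K @ np.linalg.solve(G, r)
    cov = Sigma0 - K @ np.linalg.solve(G, K.T)
    return mean, cov

rng = np.random.default_rng(0)
n = 5
Q = rng.standard_normal((n, n))
A = Q @ Q.T + n * np.eye(n)       # symmetric positive definite test matrix (assumption)
b = rng.standard_normal(n)
x_true = np.linalg.solve(A, b)

x0, Sigma0 = np.zeros(n), np.eye(n)

# Two observation directions: the posterior is still uncertain.
S_partial = rng.standard_normal((n, 2))
mean_2, cov_2 = gaussian_solution_posterior(A, b, S_partial, x0, Sigma0)

# As many directions as unknowns: the posterior collapses onto the solution.
S_full = rng.standard_normal((n, n))
mean_n, cov_n = gaussian_solution_posterior(A, b, S_full, x0, Sigma0)
```

With as many linearly independent directions as unknowns, the posterior mean coincides with the exact solution and the covariance collapses to zero. With fewer directions, the remaining uncertainty is confined to the part of the space that has not yet been probed.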
Learning to Solve Related Linear Systems
Solving multiple related parametrised systems is an essential component of many numerical tasks. Borrowing strength from the already solved systems through learning makes this process faster. In this work, we propose a novel probabilistic linear solver over the parameter space. This leverages information from the solved linear systems in a regression setting to provide an efficient posterior mean and covariance. We advocate using this as a companion regression model for the preconditioned conjugate gradient method, and discuss the favourable properties of the posterior mean and covariance as the initial guess and preconditioner. We also provide several design choices for this companion solver. Numerical experiments showcase the benefits of using our novel solver in a hyperparameter optimisation problem.
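To make the roles of the initial guess and the preconditioner concrete, here is a minimal sketch of textbook preconditioned conjugate gradients that accepts both as inputs. The regression model that would supply them is not reproduced here. As stand-ins, the solution of a slightly perturbed system plays the part of the posterior mean, and a Jacobi (diagonal) preconditioner plays the part of the posterior covariance. These stand-ins, and the names `pcg`, `apply_P` and `A_prev`, are assumptions made for this example.

```python
import numpy as np

def pcg(A, b, x0, apply_P, tol=1e-10, max_iter=1000):
    """Preconditioned CG for symmetric positive definite A.

    x0      : initial guess (stand-in for a regression posterior mean)
    apply_P : callable r -> P r (stand-in for applying a posterior covariance)
    Returns the approximate solution and the number of iterations used.
    """
    x = x0.copy()
    r = b - A @ x
    n_iter = 0
    b_norm = np.linalg.norm(b)
    if np.linalg.norm(r) <= tol * b_norm:
        return x, n_iter              # the initial guess is already good enough
    z = apply_P(r)
    p = z.copy()
    rz = r @ z
    while n_iter < max_iter:
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        n_iter += 1
        if np.linalg.norm(r) <= tol * b_norm:
            break
        z = apply_P(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p     # new preconditioned search direction
        rz = rz_new
    return x, n_iter

rng = np.random.default_rng(1)
n = 20
Q = rng.standard_normal((n, n))
A = Q @ Q.T + n * np.eye(n)           # SPD test matrix (assumption)
b = rng.standard_normal(n)
x_true = np.linalg.solve(A, b)

# Hypothetical "related" system solved earlier; its solution seeds the new solve.
A_prev = A + 0.01 * np.eye(n)
x_prev = np.linalg.solve(A_prev, b)

def jacobi(r):
    return r / np.diag(A)             # diagonal preconditioner as a simple stand-in

x_warm, iters_warm = pcg(A, b, x_prev, jacobi)
x_cold, iters_cold = pcg(A, b, np.zeros(n), jacobi)
x_same, iters_exact = pcg(A, b, x_true, jacobi)
```

Comparing `iters_warm` with `iters_cold` gives a rough sense of how much a good initial guess helps on a given problem. The size of the saving depends on how closely the earlier system resembles the new one.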
Review for NeurIPS paper: Probabilistic Linear Solvers for Machine Learning
Strengths: EDIT after rebuttal: Thank you authors for clarifying the following:
- GP regression on log(Rayleigh_i): satisfactory reply, this algorithm takes into account uncertainty about eigenvalues beyond t 1.
- Transfer learning: reusing the posterior covariance as a prior makes the method converge faster than if just the mean is reused.
I'm still confused about this, but a little bit less:
- Empirical Bayes: is indeed common, and in many applications the prior is updated as more data comes in. For example, in Bayesian optimization, after acquiring an extra point the GP hyperparameters are re-optimized. However, the weird thing here, which the authors clarified in the rebuttal, is that the prior used at each time step *contains future observations in it*. Does this imply that the posterior covariance is impossible to calculate in the middle of the algorithm, before it is terminated and thus we have the full S matrix?
Review for NeurIPS paper: Probabilistic Linear Solvers for Machine Learning
The paper proposes a new probabilistic solver for linear systems and shows that it can improve uncertainty quantification for linear solvers, which are a fundamental building block widely used in machine learning. This is a well-written paper; I particularly enjoyed Table 1, which helps situate the work in the wider literature. The author rebuttal addresses most of the major concerns, and all reviewers lean towards accept in the final discussion.